Quantum random number generators can provide genuine randomness by appealing to the fundamental
principles of quantum mechanics. In general, a physical generator contains two parts: a
randomness source and its readout. The source is essential to the quality of the resulting random
numbers; hence, it needs to be carefully calibrated and modeled to achieve information-theoretically
provable randomness. However, in practice, the source is a complicated physical system, such as
a light source or an atomic ensemble, and any deviations in the real-life implementation from the
theoretical model may affect the randomness of the output. To close this gap, we propose a
source-independent scheme for quantum random number generation in which output randomness can be
certified, even when the source is uncharacterized and untrusted. In our randomness analysis, we
make no assumptions about the dimension of the source. For instance, multiphoton emissions are
allowed in optical implementations. Our analysis takes into account the finite-key effect with the
composable security definition. In the limit of large data size, the length of the input random seed is
exponentially small compared to that of the output random bits. In addition, by modifying a quantum
key distribution system, we experimentally demonstrate our scheme and achieve a randomness
generation rate of over $5 \times 10^3$ bit/s.


I. INTRODUCTION 

Random numbers play important roles in many fields,
such as scientific simulation [1], cryptography [2], testing
fundamental principles of physics [3], and lotteries. Different
applications require different levels of randomness.
In cryptography, input randomness is one of the security
foundations in communication protocols. In fact, many
commercial products for generating random numbers exist
in the market; such products function under various
information-theoretical or computational assumptions.

In computer science, random number generators
(RNGs) are based on generating pseudorandom numbers
[1] in which a random seed is expanded according
to some deterministic procedure. By definition, these
RNGs produce sequences that are not truly random. Although
these sequences usually attain a perfect balance
between 0s and 1s, strong long-range correlations exist
which undermine cryptographic security and may cause
unexpected errors in scientific simulations.

In contrast, hardware RNGs originating from physical
processes, such as noise in electric devices, nuclear
fission, and circuit and radial decay m , are believed 
to be able to offer better random numbers. However, it 
is unclear whether they are truly random because these 
RNGs normally involve complicated classical physics processes
that produce no randomness.

To solve this problem, the new field of quantum random
number generators (QRNGs) has emerged. These
generators stem from the uncertainty principle in quantum
mechanics and are therefore inherently random. Existing
QRNG methods include single photon detection
[12-15], vacuum state fluctuations [16], and quantum
phase fluctuations [17]. These approaches have developed
to the point that some commercial QRNG products are
available [15, 18-21].

A typical QRNG can be decomposed into two modules:
a randomness source (quantum state preparation) and its
readout (measurement), as shown in Fig. 1. In general,
the source emits quantum states that are superpositions
of the measurement basis. The output (raw) random
numbers are the measurement results. In many QRNGs,
a short random seed is required to assist state preparation
or measurement.

FIG. 1. Illustration of a generic QRNG setup in which we
take photon polarization as the example. H and V refer to
horizontal and vertical polarizations, respectively. PBS refers
to a polarizing beam splitter. (a) The source functions normally
(is trusted) and sends superpositions of H and V polarizations,
which offer quantum randomness. (b) The source
malfunctions (is untrusted) and sends H and V polarizations
in a predetermined order, which should output no genuine
randomness. From the viewpoint of the measurement results, one
cannot distinguish these two cases.

As an example, consider a simple QRNG that projects
the quantum state $|+\rangle = (|H\rangle + |V\rangle)/\sqrt{2}$ emitted from
a single photon source on the horizontal and vertical polarization
basis $\{|H\rangle, |V\rangle\}$. This QRNG can be divided
into two modules, as shown in Fig. 1(a). Randomness
is guaranteed by the intrinsically probabilistic nature of
quantum physics. Hereafter, we denote $|H\rangle$, $|V\rangle$ ($|+\rangle$,
$|-\rangle$) as the Z-basis (X-basis) eigenstates.

Existing practical QRNGs suffer from security loopholes
if the devices are not perfect. In the source-readout
model, the measurement device can normally be trusted
due to its simple structure. For instance, in the previous
example, the measurement is a simple demolition
measurement on the polarization basis. In contrast, the
randomness contained in a source, such as a laser or
an atomic ensemble, is normally difficult to characterize
completely. If the source malfunctions and emits classical
signals instead of quantum ones, the outputs may not be
truly random. Consider the worst-case scenario in which
the devices are designed or controlled by an adversary,
Eve. Eve can employ a pseudo-RNG to output a string that is
fixed from Eve's viewpoint but still appears random
to Alice. More concretely, in the example of the previous
paragraph, when a dishonest source emits Z-basis instead
of X-basis eigenstates for the Z-basis measurement, the
output will just be a fixed string, as shown in Fig. 1(b).
From this perspective, with given measurement devices,
certifying the randomness of the source is crucial to generating
randomness.
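
The indistinguishability of the two cases in Fig. 1 can be illustrated with a small classical simulation. This is only a sketch under stated assumptions: the function names and the seed value are hypothetical, the operating system's entropy pool stands in for the ideal quantum measurement, and Eve's predetermined string is modeled by a seeded pseudo-RNG.

```python
import random

def z_outcomes_trusted(n, rng):
    """Z-basis results for n ideal |+> states: 0 (H) or 1 (V), each with probability 1/2.
    `rng` is a stand-in for the quantum measurement (assumption of this sketch)."""
    return [rng.getrandbits(1) for _ in range(n)]

def z_outcomes_untrusted(n, eve_seed):
    """Eve emits H or V following a string fixed in advance by her seed;
    the Z-basis measurement simply reads that string back."""
    eve = random.Random(eve_seed)
    return [eve.getrandbits(1) for _ in range(n)]

def x_error_rate_untrusted(n, rng):
    """If Z eigenstates are measured in the X basis instead, the result is
    |-> (an error) with probability 1/2."""
    return sum(rng.getrandbits(1) for _ in range(n)) / n

n = 10_000
trusted = z_outcomes_trusted(n, random.SystemRandom())
untrusted = z_outcomes_untrusted(n, eve_seed=2024)

# Both strings look balanced to Alice if she only measures in the Z basis ...
print(sum(trusted) / n, sum(untrusted) / n)
# ... but Eve can reproduce the second one exactly.
print(untrusted == z_outcomes_untrusted(n, eve_seed=2024))
# An occasional X-basis test exposes the dishonest source through its error rate.
print(x_error_rate_untrusted(n, random.SystemRandom()))
```

The last line hints at why testing in a complementary basis is useful: a source that emits Z eigenstates yields an X-basis error rate close to one-half.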

Most existing QRNGs use complicated physical models
[13, 22] to quantify their sources. For example, the
dimension of the source is sometimes assumed to be a
fixed known number [23]. The underlying models implicitly
assume the existence of randomness in the first place,
but this assumption cannot be verified experimentally.
Therefore, to achieve truly reliable randomness, there is
a strong motivation to avoid the use of such models. Note
that removing the dimension assumption is the key challenge
to the analysis for device-independent scenarios.

Thus, a QRNG that does not require trusting the source (a
source-independent QRNG) is both theoretically and practically
meaningful and greatly needed. A device-independent QRNG
[24] can generate randomness without having to trust the
devices. This type of QRNG requires a short seed for device
testing, which is why such schemes are also called
randomness expansion [25-27]. By observing the violation
of a certain Bell inequality, such as the
Clauser-Horne-Shimony-Holt inequality [28], one can guarantee
the presence of randomness without any assumptions
about the source or the measurement device. The main
drawback of device-independent QRNGs is that they are
not loss tolerant, which typically imposes very severe requirements
on experimental devices. Furthermore, this
type of QRNG generates random numbers at rates that
are very low for practical applications. The highest speed
reported so far for this type of QRNG is
0.4 bps [29].

Here, we propose a source-independent QRNG
(SIQRNG) scheme that is loss tolerant and hence highly
practical. In particular, our scheme allows the source
to have arbitrary and unknown dimensions. The loss-tolerance
property enables potential high-loss implementations
of our scheme, such as in integrated optic chips
or with inefficient but cheap single photon detectors. We
analyze the randomness of the scheme based on complementary
uncertainty relations. Our analysis takes into
account several practical issues, including finite-key-size
effects, multiphoton components in the source, initial
seed length, and losses. The analysis combines several
ingredients from the security proof of quantum key distribution
(QKD), a rich subject that has developed over the
last two decades. These ingredients include phase error
correction [30], random sampling [31], and the squashing
model [32]. Since the squashing model shows the equivalence
between threshold detectors and qubit detectors
[IH, our scheme allows the source to have an unfixed finite
dimension as well as an infinite dimension. For simplicity,
in the rest of the paper, we assume a two-level
(bit) output system. All our techniques can be directly
applied to cases with more outputs.

In many theoretical aspects, there are strong similarities
between QKD and QRNG. For example, the security
definition in QKD can be applied to the definition of randomness
in QRNG, and similar proof techniques can be
applied to both, as we do in the later analysis. However,
in some practical scenarios, there are subtle differences
between the two. For example, local randomness is free
in QKD but not in QRNG. A more crucial difference
lies in the trustworthy components of QKD and QRNG
in practice. In QKD, the sender and receiver are two
remotely separated parties, so an adversary could intercept
and resend the transmitted signals in the quantum
channel and then take advantage of the imperfections of
measurement devices to perform attacks. Thus, compared
to the source, the measurement device becomes a
more vulnerable part of a QKD system.

Unlike in QKD, the source and measurement devices
in QRNG are normally local, so attacks aimed at imperfections
in measurement devices seem more artificial than
practical. The main purpose in studying the untrusted
device scenario in QRNG is to address device imperfections.
This subtle difference may lead to deviations between
QKD and QRNG. For instance, it is reasonable
to assume that Alice can characterize the measurement
device for QRNG well and trust it during random number
production. Furthermore, compared to QKD, the
source in QRNG involves a complicated design so that
the QRNG is fast and convenient. For instance, in a recent
experiment [33], a QRNG was demonstrated based
on measuring light-emitting diode (LED) light with a
mobile phone. Such sources are hard to characterize and
could possibly be manipulated by Eve, but one can reasonably
trust one's own mobile phone. From this viewpoint,
the source in QRNG is at least as problematic as
the measurement. Thus, in our work, we make the reasonable
assumption that the measurement device can be
characterized well but the source cannot. Note that the opposite
scenario, where the source rather than the measurement
device of the QRNG is trusted, has also been recently
investigated [34].

To show the practicality of the proposed scheme, we
provide a proof-of-principle experimental demonstration
by simply modifying a QKD system. We experimentally
examine the effect of different detector efficiencies
on the randomness generation rate. Under a practical
total transmittance, a high randomness generation rate
can be achieved.

The organization of the paper is as follows. In Section II,
we present our protocol. In Section III, we analyze
the protocol and calculate the min-entropy of its
output after investigating various practical scenarios. In
Section IV, an experimental demonstration of our protocol
is performed. Finally, we conclude in Section V.


II. PROTOCOL 

A schematic of our SIQRNG protocol is shown in
Fig. 2(a). Here, we take an optical implementation as
the example, as shown in Fig. 2(b). All our results apply
similarly to other implementation systems. Quantum
signals from the source first go through a modulator that
actively chooses between the X and Z bases. Then, a
polarizing beam splitter and two threshold detectors perform
a projective measurement. Since two detectors are
used, there are four possible outcomes: no clicks (losses),
two single clicks, and double clicks. This implementation
is equivalent to the schematic setup of the squashing
model, as discussed in Section III A. The details of the
protocol are presented in Table I.



FIG. 2. (a) Measurement model for SIQRNG. The quantum
state first passes through a squasher and is projected as either
a qubit or a vacuum. Then, the output qubit is measured in
the X or Z basis chosen by an active switch. There are two
outcomes for each basis measurement, corresponding to the
two eigenstates of the basis. (b) An optical implementation of
the SIQRNG in (a), as discussed in Section III A. Here Pol-M
refers to a polarization modulator, PBS refers to a polarizing
beam splitter, and $D_0$ and $D_1$ are the threshold detectors.


1. Source: An untrusted party, Eve, prepares many
quantum states in an arbitrary and unknown dimension
and feeds them into the measurement box
of Alice.

2. Squashing: Alice (or Eve) squashes the quantum
states into qubits and vacua. Alice discards the
vacua by postselection and obtains $n$ squashed qubits.
The vacuum components take into account optical losses and
quantum efficiencies.

3. Random sampling: By consuming a short seed
with the length given in Eq. (6), Alice randomly
chooses $n_x$ out of the $n$ squashed qubits and measures
them in the X basis; each measurement results in
$|+\rangle$ or $|-\rangle$.

4. Parameter estimation: When the system operates
properly, the source emits qubits $|+\rangle$ for all
runs. Thus, a result of $|-\rangle$ in the X-basis measurement
is defined as an error. A double click is
considered to be half an error. Alice evaluates the
bit error rate $e_{bx}$ in the X basis and its statistical
deviation $\theta$ according to Eq. (5). If $e_{bx} + \theta > 1/2$,
Alice aborts the protocol.

5. Randomness generation: For the remaining
$n - n_x$ squashed qubits, Alice performs measurement
in the Z basis to generate $n_z = n - n_x$ random
bits.

6. Randomness extraction: Alice picks a parameter
$t_e$ according to the desired failure probability
restriction and extracts $n_z - n_z H(e_{bx} + \theta) - t_e$ bits
of final randomness using Toeplitz-matrix hashing (see note a).

7. Security parameter: With the composable
security definition, the security parameter
(in trace-distance measure) is given by
$\varepsilon = \sqrt{(\varepsilon_\theta + 2^{-t_e})(2 - \varepsilon_\theta - 2^{-t_e})}$.

a Other extraction methods, such as Trevisan's extractor
[37], can be applied, in which case the relation between the
failure probability and $t_e$ can differ.

TABLE I. Source-independent QRNG with the finite data size
effect. The results are proven in Section III.
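
The output length of step 6 and the security parameter of step 7 are simple to evaluate numerically. The following is a minimal sketch, assuming illustrative input values; the function names are hypothetical and are not taken from the paper.

```python
import math

def binary_entropy(e):
    """Binary Shannon entropy H(e) with base-2 logarithms."""
    if e <= 0.0 or e >= 1.0:
        return 0.0
    return -e * math.log2(e) - (1.0 - e) * math.log2(1.0 - e)

def final_length(n_z, e_bx, theta, t_e):
    """Number of extracted bits, n_z - n_z*H(e_bx + theta) - t_e (step 6).
    Returns 0 when the abort condition e_bx + theta > 1/2 of step 4 is met."""
    if e_bx + theta > 0.5:
        return 0
    length = n_z * (1.0 - binary_entropy(e_bx + theta)) - t_e
    return max(0, math.floor(length))

def security_parameter(eps_theta, t_e):
    """Trace-distance security parameter of step 7."""
    eps_f = eps_theta + 2.0 ** (-t_e)   # total failure probability (fidelity measure)
    return math.sqrt(eps_f * (2.0 - eps_f))

# Illustrative numbers only (assumptions, not measured values).
print(final_length(n_z=10**6, e_bx=0.02, theta=0.01, t_e=100))
print(security_parameter(eps_theta=2.0 ** -100, t_e=100))
```

With $\varepsilon_\theta = 2^{-100}$ and $t_e = 100$, the fidelity-measure failure probability is $2^{-99}$, so the trace-distance parameter is about $2^{-49}$; the square root is the price of converting between the two measures.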


III. ANALYSIS 


In this section, we analyze the randomness output of
the SIQRNG protocol. Strictly speaking, like device-independent
QRNGs, our scheme is a randomness expansion
scheme, in which a random seed is used to generate
extra independent randomness. The procedure of
parameter estimation is analogous to the phase error
rate estimation in QKD postprocessing [36]. Randomness
extraction is mathematically equivalent to privacy
amplification in QKD. The difference between the biased
measurement used here and the biased-basis choice QKD
protocol [38] is that the number of X-basis measurements
is a constant in our case, whereas in QKD, this number
must go to infinity when the data size is infinitely large.


A. Squashing model 


In the SIQRNG scheme, we assume that measurement
devices are trusted and well characterized. The key assumption
here is that the measurement setup is compatible
with the squashing model. In other words, a measurement
can be treated in two steps. First, the (unknown
arbitrary-dimensional) signal state emitted from
the source is projected to a qubit or vacuum. The projection
is called a squasher, as shown in Fig. 2(a). Then,
the squashed qubits are postselected by discarding the
vacua and are measured in the X or Z basis. This
assumption can be satisfied when threshold detectors are
used with random bit assignments for double clicks [32].
For the protocol described in Section II, the X-basis measurement
results are used for parameter estimation and
are then discarded in postprocessing. Thus, the random
assignment can be replaced by adding half of the double-click
ratio to the X-basis error rate.

In practice, it is a challenge to verify whether a measurement
setup is compatible with the squashing model.
Much effort has been put into this question [39]. The key
point here is to make the two detectors respond equally
to (four) different qubits, and hence make the measurement
device basis independent [40]. This can be done by
adding a series of filters (including spectrum and temporal
filters) before the threshold detectors, to ensure
that the input states stay within a proper set of optical
modes [41], in which the detectors have the same
efficiencies [32, 40]. One can further assume that Alice
uses a trusted source to calibrate the measurement
devices beforehand; that is, Alice performs a quantum
measurement tomography. A similar measurement calibration
procedure should be done in most current QKD
and QRNG realizations. Here, we emphasize that the
verification of the squashing model does not affect the
source-independent property of our scheme. Thus, we
leave a detailed investigation on validating the measurement
setting for future work.

Similar to the QKD case [32], we can assume that
the squashing operator is held by Eve in the randomness
analysis. By this, we mean that Eve can choose
a valid operator, so long as the output is a qubit or a
vacuum. In the following discussions, we focus on the
squashed qubits. We need to determine the min-entropy
associated with these qubits in the Z-basis measurement.


B. Complementary uncertainty relation 


First, we show intuitively why the protocol works. According
to quantum mechanics, the outcome of projecting
the state $|+\rangle$ on the Z basis is random. Of course,
in reality, due to device imperfections, Alice would never
obtain a perfect state of $|+\rangle$. Now, the key question
for Alice becomes how to verify that the source faithfully
emits the state $|+\rangle$. This can be done by borrowing
a similar technique from the security analysis of QKD
[30, 42, 43] and considering an equivalent virtual protocol,
depicted in Table II, where we replace steps 5 and 6 by 5'
and 6'. In steps 3 and 4 of the protocol, Alice occasionally
performs the X-basis measurement and defines the
phase error rate to be the ratio of detecting $|-\rangle$. In the
virtual protocol, once Alice knows the phase error rate
by random sampling tests, she can perform a phase error
correction (step 5') before the final Z-basis measurement
(step 6'). By a suitable design of the phase error correction
procedure [30], Alice can make it commute with
the Z-basis measurement. Thus, she can perform the Z-basis
measurement (step 5) first and then apply randomness
extraction (step 6). At this stage, all the states have
already collapsed to classical results, and the phase error
correction procedure becomes randomness extraction (or
privacy amplification in QKD) [30, 42, 43]. Besides QKD,
the argument here is similar to the one used in Ref. [44],
where one can consider the error correction process 5' as
distilling coherence or randomness extraction.

5' Error correction: Based on the phase error rate $e_{bx}$,
Alice performs phase error correction and obtains
$n_z[1 - H(e_{bx})]$ copies of perfect $|+\rangle$ with a nearly unit
probability.

6' Randomness generation: After obtaining all states
in $|+\rangle$, Alice performs measurement in the Z basis to
get $n_z[1 - H(e_{bx})]$ random bits.

TABLE II. An equivalent protocol of source-independent
QRNG.

It has been proved that the phase error correction
(randomness extraction) can be efficiently done with
Toeplitz-matrix hashing [35]. Suppose the number of
qubits measured in the Z basis is $n_z$ and the phase error
rate is $e_{pz}$. Then the number of bits sacrificed in the phase
error correction is given by

$$n_z H(e_{pz}) + t_e, \tag{1}$$

and the probability that the phase error correction fails
is $2^{-t_e}$ [36]. Here, $H(e) = -e \log e - (1 - e) \log(1 - e)$ is
the binary Shannon entropy function, all logarithms are base
2 throughout the paper, and $t_e$ is the parameter Alice
picks by balancing the failure probability and the final
output length. Then, the number of final random bits is
given by

$$K \geq n_z - n_z H(e_{pz}) - t_e. \tag{2}$$
In practice, Alice needs to prepare a Toeplitz matrix of
size $n_z \times [n_z - n_z H(e_{pz}) - t_e]$ for randomness extraction.
We note that the failure probability $2^{-t_e}$ quantifies the
fidelity between the state that results from the phase error
correction and the ideal state. In the composable
security definition [45, 46], a trace-distance measure
security parameter $\varepsilon_t$ should be employed. Its relation to
the fidelity measure $\varepsilon_f$ is given by [31]

$$\varepsilon_t = \sqrt{\varepsilon_f (2 - \varepsilon_f)}. \tag{3}$$

In the following, we use the fidelity measure for the failure
probability, which, in the end, can be conveniently
converted to the trace-distance measure security parameter.

To construct the Toeplitz matrix of size $n_z \times [n_z -
n_z H(e_{pz}) - t_e]$, Alice needs to use $n_z + [n_z - n_z H(e_{pz}) -
t_e] - 1$ random bits. Thanks to the Leftover Hash Lemma
[47], the Toeplitz hashing extractor can be proven to be
a strong extractor. That is, the output random bits are
independent of the random bits used in the construction
of the Toeplitz matrix [48]. Thus, the Toeplitz matrix
can be reused.
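
Toeplitz-matrix hashing itself can be sketched in a few lines. The version below is a plain, unoptimized illustration and not the implementation used in the experiment; the function name and index convention are assumptions. A seed of $n + m - 1$ bits defines an $m \times n$ binary Toeplitz matrix, which is multiplied by the raw string modulo 2.

```python
def toeplitz_extract(raw_bits, seed_bits, m):
    """Hash n raw bits down to m bits with a binary Toeplitz matrix.

    The matrix entry T[i][j] = seed_bits[i - j + n - 1] depends only on i - j,
    so n + m - 1 seed bits fix the whole m x n matrix.
    """
    n = len(raw_bits)
    if len(seed_bits) != n + m - 1:
        raise ValueError("seed must contain n + m - 1 bits")
    out = []
    for i in range(m):
        acc = 0
        for j in range(n):
            acc ^= seed_bits[i - j + n - 1] & raw_bits[j]   # inner product mod 2
        out.append(acc)
    return out

# Tiny worked example: n = 3 raw bits, m = 2 output bits, 4 seed bits.
print(toeplitz_extract([1, 0, 1], [1, 0, 0, 1], 2))
```

In a real extraction, $m$ would be set to $n_z - n_z H(e_{pz}) - t_e$ from Eq. (2). This double loop costs $O(nm)$ bit operations; practical implementations typically use faster convolution-based methods.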

Our result can also be derived via a different but elegant
approach by employing a newly developed seminal
uncertainty relation [49] and extending the Leftover Hash
Lemma [13] to the quantum scenario [50]. Interestingly,
the result from that approach yields a security parameter
(in trace-distance measure) that is of the order of
$2^{-t_e/2}$, which is consistent with ours. Such techniques
have been successfully applied in some applications, including
QRNGs [23].

C. Finite key analysis 

In practice, the QRNG only runs for a finite time; consequently,
the sampling tests for the X-basis measurements
will suffer from statistical fluctuations. In the parameter
estimation step, the key parameter $e_{pz}$ in Eq. (2)
should be estimated (bounded) taking the finite data size
effect into account.

In the random sampling test, Alice measures the
squashed qubits in the X basis and obtains the error
rate, $e_{bx}$. Remember that, as required in the squashing
model, this error rate includes half of the double-click ratio.
Henceforth, we simply call this error rate the X-basis
error rate. Recall that the phase error rate $e_{pz}$ is defined
as the error rate if the quantum signals measured in the
Z basis are measured in the X basis. When the sampling
size is large enough, $e_{pz}$ can be well approximated by $e_{bx}$.

Before presenting the details of the random sampling
analysis, we establish some notation. Suppose Alice receives
$n$ squashed qubits and randomly chooses $n_x$ of them to
be measured in the X basis, leaving the remaining $n_z =
n - n_x$ qubits in the Z basis. Let the ratio of X-basis
measurements be $q_x = n_x/n$, the number of errors Alice
finds in the X basis be $k$, and the total number of errors
be $m$ if Alice measures all qubits in the X basis. Then,
the number of errors in the qubits measured in the Z
basis is $m - k$, which is the key parameter we need to
determine through random sampling. The quantity $m -
k = n_z e_{pz}$ determines the randomness extraction rate.
Define the upper bound of $e_{pz}$ by

$$e_{pz} \leq e_{bx} + \theta, \tag{4}$$

where $\theta$ is the deviation due to statistical fluctuations.
Following the random sampling results of Fung et al. [31],
we can bound the probability that Eq. (4) fails,

$$\varepsilon_\theta = \mathrm{Prob}(e_{pz} > e_{bx} + \theta) \leq \frac{1}{\sqrt{q_x (1 - q_x) e_{bx} (1 - e_{bx}) n}} \, 2^{-n \xi(\theta)}, \tag{5}$$

where $\xi(\theta) = H(e_{bx} + \theta - q_x \theta) - q_x H(e_{bx}) - (1 - q_x) H(e_{bx} +
\theta)$. Note that in the unlikely event that $e_{bx} = 0$, the
bound on the failure probability diverges, and one should rederive
the failure probability or simply replace $e_{bx}$ with a small
value, say, $1/n_x$.

In practice, the failure probability $\varepsilon_\theta$ is normally picked
to be a small number depending on applications. In later
data postprocessing, we pick $\varepsilon_\theta = 2^{-100}$. Once $\varepsilon_\theta$ is
fixed, there is a trade-off between $q_x$ and $\theta$ for the ratio
of the final random bit length over the raw data size.
Thus, the number of samples for the X-basis measurement
should be optimized for the randomness extraction
rate.
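
The trade-off can be explored numerically by solving the random sampling bound of Eq. (5) for the smallest deviation $\theta$ that meets a target failure probability. The sketch below assumes the form $\varepsilon_\theta \leq 2^{-n\xi(\theta)}/\sqrt{q_x(1-q_x)e_{bx}(1-e_{bx})n}$ and works with base-2 logarithms to avoid floating-point underflow. The function names and the input values are hypothetical.

```python
import math

def h2(e):
    """Binary Shannon entropy with base-2 logarithms."""
    if e <= 0.0 or e >= 1.0:
        return 0.0
    return -e * math.log2(e) - (1.0 - e) * math.log2(1.0 - e)

def xi(theta, e_bx, q_x):
    """Exponent xi(theta) of the random sampling bound; non-negative by concavity of H."""
    return (h2(e_bx + theta - q_x * theta)
            - q_x * h2(e_bx)
            - (1.0 - q_x) * h2(e_bx + theta))

def log2_eps_theta(theta, e_bx, q_x, n):
    """log2 of the upper bound on the failure probability."""
    if not 0.0 < e_bx < 1.0:
        raise ValueError("bound undefined for e_bx = 0; use a small value such as 1/n_x")
    prefactor = -0.5 * math.log2(q_x * (1.0 - q_x) * e_bx * (1.0 - e_bx) * n)
    return prefactor - n * xi(theta, e_bx, q_x)

def smallest_theta(e_bx, q_x, n, log2_target=-100.0, tol=1e-9):
    """Bisection for the smallest theta with bound <= 2**log2_target.
    Returns None if no theta with e_bx + theta <= 1/2 suffices (protocol aborts)."""
    lo, hi = 0.0, 0.5 - e_bx
    if hi <= 0.0 or log2_eps_theta(hi, e_bx, q_x, n) > log2_target:
        return None
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if log2_eps_theta(mid, e_bx, q_x, n) <= log2_target:
            hi = mid
        else:
            lo = mid
    return hi

# Illustrative inputs (assumptions): 2% X-basis error rate, 2% sampling ratio.
theta = smallest_theta(e_bx=0.02, q_x=0.02, n=10**6)
print(theta)
```

Bisection is valid here because $\xi(\theta)$ increases monotonically with $\theta$, so the bound decreases monotonically. Scanning `q_x` with this routine and inserting the resulting $\theta$ into Eq. (2) gives one way to optimize the sampling ratio.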

One key property for the random sampling is that the
$n_x$ locations of the X-basis measurements are randomly
chosen from the total $n$ locations, i.e., the $\binom{n}{n_x}$ cases are
equally likely to occur. Then, Alice needs a random seed
with a length of

$$n_{\mathrm{seed}} = \log \binom{n}{n_x} < n_x \log n. \tag{6}$$

The effect of loss on the seed length will be discussed in
Section III D. In the Appendix, we show that $n_x$ can remain
a constant, given the failure probability, when $n$ is
large. Then, in the large data size limit, the seed length is
exponentially small compared to the length of the output
random bits. Therefore, we reach an exponential randomness
expansion.
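
The scaling of the seed length in Eq. (6) is easy to check numerically. In this sketch the value of $n_x$ is an arbitrary illustrative choice and the function name is hypothetical; the binomial coefficient is evaluated as a sum of logarithms so that very large $n$ poses no problem.

```python
import math

def log2_binomial(n, k):
    """log2 of the binomial coefficient C(n, k), computed term by term."""
    return sum(math.log2((n - i) / (i + 1)) for i in range(k))

n_x = 1000  # held constant while the data size n grows (illustrative value)
for n in (10**6, 10**9, 10**12):
    seed = math.ceil(log2_binomial(n, n_x))
    bound = math.ceil(n_x * math.log2(n))
    print(n, seed, bound)
```

With $n_x$ fixed, the seed length grows only logarithmically in $n$, while the number of raw Z-basis bits, $n - n_x$, grows linearly.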


D. Practical issues 

Multiphotons: In our protocol, the source is allowed
to emit multiphotons, since its dimension is assumed to
be uncharacterized. In other words, these components do
not affect the randomness of the final output. In practice,
multiphotons may introduce double clicks when threshold
detectors are used [32]; these double clicks will directly
contribute to the error rate term $e_{bx}$. Thus, when
the multiphoton ratio is very high, the double-click ratio
will increase to a point at which the upper bound on information
leakage $e_{pz}$ increases to one-half; at that point,
no random bits can be extracted according to Eq. (2), and
Alice simply aborts the protocol.

Loss: The loss tolerance of our protocol is guaranteed
by the squashing model in which the measurement is assumed
to be basis independent [32]. This assumption can
be guaranteed by the fact that the basis is chosen after
losses. Alice does not anticipate the positions of losses, so
she effectively decides the (random) positions for X-basis
measurements before losses. Loss only decreases
the number of effective X measurements, but the
positions of effective X measurements are still uniformly
random in squashed qubits; this fulfills the requirement
of random sampling. The detailed proof is shown in Appendix B.

Basis-dependent detector efficiency: Our protocol
assumes that the efficiencies of the detectors are the
same. In practice, efficiency mismatches would cause the
measurement to be different for the two bases (basis dependent).
A viable way to deal with this imperfection is
to recalculate the rate as a function of the ratio between
the efficiencies of the two bases, employing the technique
used in QKD [40]. As indicated by the result in QKD [40],
the random number generation rate will slightly decrease
when there is a small mismatch in detector efficiencies.
More precisely, we denote the ratio between the minimum
and maximum efficiencies of the two detectors as $r \leq 1$;
then the key size becomes $r n_z \{1 - H[(e_{bx} + \theta)/r]\} - t_e$
bits. We leave a detailed analysis of this imperfection for
future work.
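
To get a feeling for how strongly a mismatch reduces the output, the expression $r n_z \{1 - H[(e_{bx} + \theta)/r]\} - t_e$ can be evaluated directly. This is only an evaluation of the stated formula under illustrative inputs, with a hypothetical function name; it is not a substitute for the detailed analysis that the text leaves open.

```python
import math

def h2(e):
    """Binary Shannon entropy with base-2 logarithms."""
    if e <= 0.0 or e >= 1.0:
        return 0.0
    return -e * math.log2(e) - (1.0 - e) * math.log2(1.0 - e)

def length_with_mismatch(n_z, e_bx, theta, t_e, r):
    """Output length r*n_z*(1 - H((e_bx + theta)/r)) - t_e for efficiency ratio 0 < r <= 1."""
    if not 0.0 < r <= 1.0:
        raise ValueError("r must satisfy 0 < r <= 1")
    e_eff = (e_bx + theta) / r          # error rate inflated by the mismatch
    if e_eff >= 0.5:
        return 0                        # no randomness can be certified
    return max(0, math.floor(r * n_z * (1.0 - h2(e_eff)) - t_e))

# Illustrative values (assumptions): compare matched and mismatched detectors.
for r in (1.0, 0.95, 0.9, 0.8):
    print(r, length_with_mismatch(n_z=10**6, e_bx=0.02, theta=0.01, t_e=100, r=r))
```

For $r = 1$ the expression reduces to the output length of the matched case.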

Double clicks: Our analysis takes account of the effect
of double clicks by adding half of the double-click ratio
to the X-basis error rate, as required in the squashing
model. This is also essentially why multiphoton states
can be used on the source side without affecting final
randomness. Note that double clicks should not be discarded
freely in the measurement. Otherwise, a security
loophole will appear, namely, a strong pulse attack [51].
In a strong pulse attack, Eve always sends strong signals
(with many photons) in the Z basis. Suppose she
sends a strong state in $|H\rangle$; if Alice chooses the Z-basis
measurement, a valid raw random bit will be obtained,
but if she chooses the X basis, a double click is likely to
happen. In our protocol, when Alice chooses the X-basis
measurement, she should get an error (resulting in $|-\rangle$)
with a probability of one-half. If Alice simply discards all
double clicks, Eve's attack will not be noticed. This attack
cannot be explained by a qubit measurement. This
is intuitively why the squashing model requires random
assignments for double clicks.
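
The difference between counting and discarding double clicks can be made concrete with a toy bookkeeping example. The outcome labels and function name below are hypothetical, and the attack is idealized so that every X-basis test of a strong pulse produces a double click.

```python
def x_basis_error_rate(outcomes, count_double_clicks=True):
    """X-basis error rate from outcomes labeled '+', '-', or 'double'.

    With count_double_clicks=True, each double click counts as half an error,
    as the squashing model requires. With False, double clicks are discarded,
    which is the insecure choice discussed in the text.
    """
    errors = sum(1 for o in outcomes if o == '-')
    doubles = sum(1 for o in outcomes if o == 'double')
    if count_double_clicks:
        total = len(outcomes)
        return (errors + 0.5 * doubles) / total if total else 0.0
    kept = len(outcomes) - doubles
    return errors / kept if kept else 0.0

# Idealized strong pulse attack: every X-basis test round gives a double click.
attack = ['double'] * 1000
print(x_basis_error_rate(attack, count_double_clicks=True))    # 0.5
print(x_basis_error_rate(attack, count_double_clicks=False))   # 0.0

# Well-behaved source with 2% errors and no double clicks: both rules agree.
honest = ['+'] * 980 + ['-'] * 20
print(x_basis_error_rate(honest), x_basis_error_rate(honest, count_double_clicks=False))
```

Counting double clicks pushes the error rate to one-half, so the protocol aborts; discarding them leaves no evidence of the attack.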

Basis choice: When choosing X- or Z-basis measurements,
an input random string of length $N$ (as a seed) is
needed. Suppose the number of X-basis measurements
to be performed is $N_x$; then Alice chooses $N_x$ positions
out of $N$ with equal probability, i.e., with probability
$1/\binom{N}{N_x}$. Then, she needs a seed length of $\log \binom{N}{N_x}$. This
is similar to Eq. (6), with the difference that, before the
measurement, Alice does not know the positions of losses.
More details on how to dilute a short random seed to a
longer (partially random) one are provided in the Appendix.

Intensity optimization: The intensity of the source
should be optimized to maximize the randomness generation
rate. With increasing intensity, the detection rate
will increase along with an increase in the double-click
rate (and hence $e_{pz}$ increases). There exists a trade-off
between $n_z$ and $e_{pz}$, as shown in Eq. (2).

IV. EXPERIMENT DEMONSTRATION 

In this section, we perform a proof-of-principle experimental
demonstration to show the practicality of the
SIQRNG scheme. Our experiment setup consists of two
parts: the source, owned by an untrusted party Eve, and
the measurement device, owned by the user Alice. The
schematic diagram is shown in Fig. 3.





FIG. 3. Experiment setup of SIQRNG. S: laser source; LP: 
linear polarizer; FPC: fiber polarization controller; FA: fiber 
attenuator; BS: beam splitter; PBS: polarizing beam splitter; 
TD: time delay implemented with a 12 m fiber; PD: photon 
detector. 

On Eve's side, a laser, labeled as S, with a wavelength
of 850 nm and a repetition rate of 1 MHz is used as a
photon source. The power of the laser is adjusted to be
one photon per pulse. Rather than a qubit system,
each pulse that the laser sends is a
coherent state of infinite dimensions. The pulse of the
laser is then modulated to $|+\rangle$ polarization by a linear
polarizer (LP) and a fiber polarization controller (FPC1).
Between the source and the measurement device, we put
a fiber attenuator (FA) to simulate different losses in the
system.

On Alice’s side, a series of filters must first be applied 
to ensure that the measured optical mode is pure before 
it enters the threshold detectors, as required by the 
squashing model. For demonstration purposes, we use 
a single-mode fiber as the filter. Ideally, 
frequency and temporal filters should also be added to 
further purify the optical mode so that the photons are 
indistinguishable. A biased beam splitter (BS1) with a 
ratio of 1 : 49 passively chooses the X or Z basis. Finally, 
Alice records when the photon detector (PD) clicks. The 
detector is time-division-multiplexed by adding four time 
delays, TD1-TD4 (60 ns each), in the optical paths, so 
that it can simulate four detectors that register the 
outcomes of both bases and each bit value. The gate width 
and the dead time of the detector are 10 ns and 50 ns, 
respectively. 
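
As an illustration of how a single time-multiplexed detector can stand in for four, the following Python sketch assigns each click to a basis and bit value from its arrival time. The slot-to-outcome table, the assumption that successive outcomes arrive 60 ns apart, and the name `decode_click` are hypothetical; the text specifies only the delay step, the gate width, and the repetition rate.

```python
# Values quoted in the text.
SLOT_SPACING_NS = 60.0    # delay step between multiplexed outcomes
GATE_WIDTH_NS = 10.0      # detector gate width
PULSE_PERIOD_NS = 1000.0  # 1 MHz repetition rate

# Hypothetical assignment of arrival slots to (basis, bit) outcomes;
# the real assignment depends on how TD1-TD4 are placed in the setup.
SLOT_TO_OUTCOME = {0: ("Z", 0), 1: ("Z", 1), 2: ("X", 0), 3: ("X", 1)}


def decode_click(timestamp_ns):
    """Return (pulse_index, basis, bit), or None if the click
    does not fall inside the gate of any of the four slots."""
    pulse_index, offset = divmod(timestamp_ns, PULSE_PERIOD_NS)
    slot = round(offset / SLOT_SPACING_NS)
    if slot not in SLOT_TO_OUTCOME:
        return None
    # Reject clicks that are outside the gate centered on the slot.
    if abs(offset - slot * SLOT_SPACING_NS) > GATE_WIDTH_NS / 2:
        return None
    basis, bit = SLOT_TO_OUTCOME[slot]
    return int(pulse_index), basis, bit
```

In this sketch the 60 ns slot spacing exceeds the 50 ns dead time, so a click in one slot does not blind the detector for the next slot.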

The phase error rate, as calculated in Eq. 0, is plotted 
in Fig. 4. The typical values of the related experimental 
parameters are as follows. The raw key size is 
$N = 10^6$; the dark count is 0.002; the detector efficiency 
(without an FC adaptor) is 45%; the misalignment error 
of the source is 2%; and the failure probability is 
$\varepsilon_\theta = 2^{-100}$. The figure shows that the error rate increases 
as the loss becomes large. This is because the effect of 
dark counts becomes dominant when the loss is high. 
Because of statistical fluctuations, the phase error rate 
increases when the data size shrinks. Note, in particular, 
that the phase error rate can exceed 20% under high 
losses, a level at which most QKD protocols yield no 
key. Nevertheless, random numbers can still be 
generated in our SIQRNG scheme. 



FIG. 4. Relation between the phase error rate and the loss. 
The large error bars are caused by a very conservative 
estimation of statistical fluctuations and also partially by the 
fluctuation of experimental parameters for different losses. 

The relation between the randomness generation rate 
and the loss is plotted in Fig. 5. The randomness 
generation rate decreases as the loss increases, which is 
consistent with Fig. 4. With a practical detector 
efficiency, the randomness generation rate still reaches 
$5 \times 10^3$ bit/s. Note that the intensity of the source is 
fixed in our experimental demonstration. In practice, the 
intensity of the source can be increased to compensate for 
the loss, and the maximum randomness generation rate in 
our scheme is then limited mainly by the dead time of the 
detector. For our detector with a dead time of 50 ns, the 
maximum randomness generation rate is 1 bit/50 ns = 20 Mbps, 
which requires the source to be a single-photon source with 
a repetition rate of 20 MHz. For practical implementations 
with coherent-state sources, the randomness generation rate 
can reach the order of 2 Mbps after taking into account 
various errors and finite-data-size effects. 
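
The rate figures in this paragraph follow from simple arithmetic, sketched below in Python. The function names are illustrative and not part of the experiment; the only inputs are the 50 ns dead time, the 1 MHz repetition rate, and the measured rate quoted in the text.

```python
def dead_time_limited_rate(dead_time_s, bits_per_click=1.0):
    """Upper limit on the bit rate when consecutive clicks are
    separated by at least the detector dead time."""
    return bits_per_click / dead_time_s


def bits_per_pulse(rate_bps, repetition_rate_hz):
    """Average number of random bits obtained per emitted pulse."""
    return rate_bps / repetition_rate_hz


# 1 bit per 50 ns gives about 2e7 bit/s, i.e., 20 Mbps.
max_rate = dead_time_limited_rate(50e-9)

# The demonstrated 5e3 bit/s at 1 MHz is 0.005 bit per pulse.
demo_yield = bits_per_pulse(5e3, 1e6)
```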



FIG. 5. Dependency of the randomness generation rate on the 
loss (horizontal axis: loss in dB). The data points in the 
figure are the lower bound of the rate, evaluated by random 
sampling. The security parameter is $\varepsilon_t = 2 \times 2^{-50}$. 

After obtaining the random bits, we apply the 
Toeplitz-matrix hashing fdbf on the raw data to obtain 
final random numbers. To test the randomness, we further 
perform two statistical tests on the output of our 
SIQRNG: the autocorrelation test and the NIST test 
suite [52]. The autocorrelation is defined as 

$$R(j) = \frac{E[(X_i - \mu)(X_{i+j} - \mu)]}{\sigma^2}, \tag{7}$$

where $j$ is the lag between the samples, $X_i$ is the $i$-th 
sample bit, $\mu$ and $\sigma^2$ are the average and the variance 
of the sample, and $E$ stands for expectation. The results 
of the autocorrelation test on the raw data and the final 
data are shown in Fig. 6. The autocorrelation is 
substantially reduced in the final data. The results of the 
NIST tests on the final data are shown in Fig. 7. All 
tests are passed. 
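
The two postprocessing steps described here, Toeplitz-matrix hashing and the autocorrelation of Eq. (7), can be sketched in a few lines of Python with NumPy. This is a minimal illustration rather than the implementation used in the experiment: the function names and the dense-matrix construction are assumptions, and a block of the size reported in the text would call for an FFT-based or blockwise multiplication instead.

```python
import numpy as np


def toeplitz_hash(raw_bits, seed_bits, out_len):
    """Multiply the raw bits by a binary Toeplitz matrix over GF(2).

    The out_len x n matrix is defined by n + out_len - 1 seed bits
    and is constant along its diagonals.
    """
    raw = np.asarray(raw_bits, dtype=np.uint8)
    seed = np.asarray(seed_bits, dtype=np.uint8)
    n = raw.size
    if seed.size != n + out_len - 1:
        raise ValueError("seed must have n + out_len - 1 bits")
    # Entry (i, j) depends only on i - j, so each diagonal is constant.
    idx = np.arange(out_len)[:, None] - np.arange(n)[None, :] + (n - 1)
    matrix = seed[idx].astype(np.int64)
    return (matrix @ raw.astype(np.int64)) % 2


def autocorrelation(bits, lag):
    """Sample estimate of R(j) in Eq. (7) for a lag of at least 1."""
    x = np.asarray(bits, dtype=float)
    mu = x.mean()
    var = x.var()
    products = (x[: x.size - lag] - mu) * (x[lag:] - mu)
    return products.mean() / var
```

For example, `toeplitz_hash(raw, seed, m)` with `m` smaller than `len(raw)` compresses the raw block to `m` bits, and `autocorrelation(bits, j)` can then be evaluated on both the raw and the hashed strings for a range of lags `j`.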

V. CONCLUSION 

We have proposed a source-independent and loss-tolerant 
QRNG scheme and demonstrated it experimentally 
in a passive-basis-choice realization. From an 
experimental point of view, the beam splitter itself, as part 
of the measurement device, may also be uncharacterized. 
Thus, it would also be interesting to demonstrate 











FIG. 6. The autocorrelation function of the raw data and the 
final data. The x axis is the lag $j$ between the sampled data $X_i$ 
and $X_{i+j}$, while the y axis is the autocorrelation $R(j)$ defined 
in Eq. (7). Data sizes of both the raw data and the final data 
are on the order of $10^7$. The autocorrelation of the final data 
is significantly smaller in absolute value than that of the raw 
data. Because of finite-key-size effects, the autocorrelation 
cannot be exactly zero even for perfectly random strings. 



FIG. 7. The P value of the statistical tests. The x axis lists 
the names of statistical tests in the NIST test suite. The final 
data size is 91 Mbit, which is extracted from 115-Mbit raw 
data. To pass each test, the P value should be at least 0.01 
and the proportion of sequences that satisfy P > 0.01 should 
be at least 96%. It can be seen in the figure that the P values 
of all tests are greater than 0.01. 


our scheme with an active basis choice in the future. In 
fact, when the source operates properly, the speed of our 
protocol is comparable to that of a trusted polarization-based 
QRNG whose frequency is limited only by single-photon 
detectors, approximately 100 Mbps [53]. 

Some current realizations of QRNG experiments could 
be converted to our SIQRNG protocol. For example, an 
LED could be used as the source, as in a regular QRNG [33]. 
Since the polarization of light from an LED is random, it 
would be convenient to add a polarizer along the $|+\rangle$ 
direction to polarize the source light. Since the detector 
can work in a gated mode, it does not matter whether the 
light source is continuous or pulsed. This is why the 
repetition rate is limited only by single-photon detectors. 
Viewed from another angle, such a setup could also be 
used to test quantum features of macroscopic sources. 

For future projects, it would be interesting to investigate 
other loss-tolerant self-testing QRNG schemes. Essentially, 
we aim to design a QRNG that tolerates large losses and 
generates random numbers quickly at the same time, under 
the minimal assumptions of a practical setup. 

Added note: Upon completion of this work, we noticed 
a related work [23], in which the uncertainty relation is 
employed to quantify entropy in QRNG and finite-key 
effects are taken into consideration with smooth min 
entropies. That work also aims at provable randomness 
with untrusted sources. However, it makes a strong 
assumption on the dimension of the source, which turns 
out to be the key barrier for source-independent QRNG. 
Moreover, practical imperfections, such as multiphoton 
emissions, device imperfections, and losses, are not 
considered. Our work, on the other hand, uses the squashing 
model for systems of arbitrary dimension and takes account 
of imperfections in practical scenarios. 
Appendix A: Calculation of the number of effective 
X-basis measurements 

In this appendix, we show that in the asymptotic limit, 
the number of effective X-basis measurements is independent 
of $n$. Our starting point is Eq. (5) and $\varepsilon_\theta < 2^{-100}$. 
Notice that normally $n$ is smaller than $10^{12} < 2^{40}$ to ease 
fast postprocessing; thus, the term $1/\sqrt{n}$ and the other 
polynomial terms in Eq. (5) play a relatively small role 
in making $\varepsilon_\theta < 2^{-100}$. In the following, we consider only 
the exponent in Eq. (5). 

For ease of notation, let $x = e_{bx}$, $y = e_{bx} + \theta$, and 
$q = q_x$. Then the exponent of Eq. (5) becomes 

$$n[H((1 - q)y + qx) - qH(x) - (1 - q)H(y)],$$

and the inequality $\varepsilon_\theta < 2^{-100}$ is approximately equivalent to 

$$n[q(H((1 - q)y + qx) - H(x)) + (1 - q)(H((1 - q)y + qx) - H(y))] > 100. \tag{A1}$$

Since $q$ is very small, one can make three approximations: 

$$H((1 - q)y + qx) - H(y) \approx -H'(y)q(y - x), \tag{A2}$$

$$q[H((1 - q)y + qx) - H(x)] \approx q(H(y) - H(x)), \tag{A3}$$

$$q^2 \approx 0. \tag{A4}$$

Then, by applying Eqs. (A2) and (A3), the inequality 
(A1) becomes 

$$n[q(H(y) - H(x)) - (1 - q)H'(y)q(y - x)] > 100. \tag{A5}$$

Applying Eq. (A4) yields 

$$n[q(H(y) - H(x)) - H'(y)q(y - x)] > 100, \tag{A6}$$

and rearranging terms, we have 

$$q > \frac{100}{n[H(y) - H(x) - H'(y)(y - x)]}. \tag{A7}$$

Substituting the definitions of $x$ and $y$, we obtain 

$$q > \frac{100}{n[H(e_{bx} + \theta) - H(e_{bx}) - H'(e_{bx} + \theta)\theta]}. \tag{A8}$$

Finally, we substitute $q = n_x/n$ and get 

$$n_x > \frac{100}{H(e_{bx} + \theta) - H(e_{bx}) - H'(e_{bx} + \theta)\theta}, \tag{A9}$$

which is independent of $n$. 


Appendix B: Proof of the random sampling 
property for a type of QRNG input after loss 

In this appendix, we first restate the setting. In the 
idealistic protocol, the measurement device chooses its 
measurement basis after confirming that the state received 
from the source is not a vacuum (or equivalently, 
not lost). In practice, confirming whether a state is a 
vacuum is usually done by observing whether detectors 
in the measurement device click or not. Thus, it is desirable 
for the measurement device to choose its basis before 
confirming whether loss happens. 

We prove that for a specific input that defines the 
measurement basis choices before the potential loss, the 
positions of $n_x$ valid X-basis measurements (after excluding 
loss events) are randomly drawn from the positions of 
the total of $n$ valid measurements. This proves that the 
random sampling technique from Fung et al. can still be 
applied when the measurement basis is chosen before the 
loss. 

For ease of presentation, we state the input that specifies 
the measurement choices before the loss as follows. 
The input is a string of length $N = N_x + N_z$ that contains 
$N_x$ 0s and $N_z$ 1s. The $\binom{N_x + N_z}{N_z}$ possibilities for choosing 
the positions of $N_z$ 1s from the total $N_x + N_z$ positions 
are equally likely. Here, 0 stands for an X-basis measurement 
and 1 stands for a Z-basis measurement. After loss, 
the numbers of valid X-basis measurements and Z-basis 
measurements are denoted by $n_x$ and $n_z$, respectively, 
with a total string length of 

$$n = n_x + n_z. \tag{B1}$$

We need to show that the output is uniform for the 
$\binom{n_x + n_z}{n_z}$ possibilities of choosing the positions of $n_z$ 1s 
from the total $n$ positions. 

The proof proceeds through a symmetry argument. 
The input is symmetric, i.e., if we exchange the indices of 
two positions, the distribution will not change. Suppose 
that the initial positions are $1, 2, \ldots, N$ and the probability 
of choosing specific positions for $N_z$ 1s from the total 
$N$ positions is 

$$p = \frac{1}{\binom{N_x + N_z}{N_z}}. \tag{B2}$$

For ease of presentation, denote the remaining positions 
after loss as $i_1 < i_2 < \cdots < i_n$. Then each possibility with $n_x$ 
0s in the remaining $n$ positions has the same probability 

$$p_1 = p \times \binom{N - n}{N_x - n_x}, \tag{B3}$$

which proves our claim. 

As a side remark, we see that the proof does not depend 
on whether the loss is basis dependent or independent. 
Thus, the same property also holds for a more general 
class of losses that could be useful in other settings. 
Another remark is that independent and identically 
distributed input also satisfies the property, as in the work 
of Fung et al. 
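
The random sampling property of Appendix B can be checked by exhaustive enumeration on a small instance. The Python sketch below is only an illustration under stated assumptions: each position is lost independently, with a loss probability that may depend on the basis, and the function names and parameter values are hypothetical. Exact fractions are used so that equality of probabilities can be tested without rounding.

```python
from collections import defaultdict
from fractions import Fraction
from itertools import combinations, product
from math import comb


def surviving_pattern_probs(N, N_z, loss_x, loss_z):
    """Joint probability of (surviving positions, bits on them) when the
    positions of the N_z ones are uniform and loss is basis dependent."""
    p_input = Fraction(1, comb(N, N_z))
    probs = defaultdict(Fraction)
    for ones in combinations(range(N), N_z):
        bits = [1 if i in ones else 0 for i in range(N)]
        for kept in product([False, True], repeat=N):
            weight = p_input
            for bit, keep in zip(bits, kept):
                loss = loss_z if bit else loss_x
                weight *= (1 - loss) if keep else loss
            survivors = tuple(i for i in range(N) if kept[i])
            pattern = tuple(bits[i] for i in survivors)
            probs[(survivors, pattern)] += weight
    return probs


def is_uniform_given_counts(probs):
    """True if, for fixed survivors and a fixed number of ones among them,
    every arrangement of those ones occurs with the same probability."""
    groups = defaultdict(list)
    for (survivors, pattern), weight in probs.items():
        groups[(survivors, sum(pattern))].append(weight)
    return all(
        len(set(weights)) == 1 and len(weights) == comb(len(survivors), n_z)
        for (survivors, n_z), weights in groups.items()
    )
```

Calling `is_uniform_given_counts(surviving_pattern_probs(5, 2, Fraction(1, 3), Fraction(1, 5)))` exercises the basis-dependent case mentioned in the side remark.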


Appendix C: Random seed dilution 

The input is either given directly or expanded from a 
uniformly random seed. Here, we provide a method for 
performing the expansion. The expansion is straightforward 
since the input is also uniformly random within its 
support. We can simply map a uniform seed of length 
$\log\binom{N}{c_1}$ bijectively to the input support, which consists 
of the $\binom{N}{c_1}$ possibilities for choosing the positions of $c_1$ 0s 
in the string of length $N$. Then, we obtain the desired input. 
Furthermore, note that this construction is deterministic; 
thus, input randomness is needed only for this uniformly 
random seed. 
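
One concrete way to realize such a bijection is lexicographic unranking of combinations, sketched below in Python. The appendix does not specify which bijection is used, so this choice and the name `unrank_input` are assumptions made for illustration. The seed is represented as an integer rank in the range from 0 to $\binom{N}{c_1} - 1$; producing a uniform rank from seed bits (for example by rejection sampling) is outside the sketch.

```python
from math import comb


def unrank_input(rank, N, c1):
    """Map an integer rank in [0, C(N, c1)) to the length-N bit string
    with exactly c1 zeros that has this rank in lexicographic order.
    A 0 marks an X-basis run and a 1 marks a Z-basis run."""
    if not 0 <= rank < comb(N, c1):
        raise ValueError("rank out of range")
    bits = []
    zeros_left = c1
    for pos in range(N):
        remaining = N - pos - 1
        # Number of valid strings that place a 0 at this position.
        with_zero = comb(remaining, zeros_left - 1) if zeros_left > 0 else 0
        if rank < with_zero:
            bits.append(0)
            zeros_left -= 1
        else:
            bits.append(1)
            rank -= with_zero
    return bits
```

Because the map is deterministic and one-to-one, a uniform rank yields an input that is uniform over its support, which is the property the expansion needs.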

For the input of our protocol, the ratio of the initial 
random seed length to the number of runs $N$ becomes 
negligible as $N$ goes to infinity because the number of 
X-basis measurements $c_1$ is a constant, as derived in 
Appendix A. More precisely, the min entropy of the input, 
as well as the length of the uniformly random seed, has 
an upper bound given by 

$$\log\binom{N}{c_1} \le c_1 \log N. \tag{C1}$$

Note that since the detector completely controls this 
random seed length, calculating the exact input min entropy 
is possible. This is very different from estimating the 
error rate in the finite-key analysis section, in which we 
can only estimate the range of the error rate with a high 
probability of success. Apart from the input specified in 
the main text, independent and identically distributed 
bit strings are also a possible choice for the input. 
Finally, we remark that the reason to include this input 
seed length analysis is to make our QRNG composable. 
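
To make the scaling concrete, the following Python sketch evaluates the number of X-basis measurements suggested by the bound (A9) and compares the exact seed length with the bound (C1) for growing $N$. The error rate of 0.02, the deviation $\theta = 0.03$, the use of base-2 logarithms, and all function names are assumptions chosen for illustration, not values taken from the experiment.

```python
from math import ceil, comb, log2


def binary_entropy(p):
    """Binary Shannon entropy H(p) in bits."""
    if p in (0.0, 1.0):
        return 0.0
    return -p * log2(p) - (1 - p) * log2(1 - p)


def binary_entropy_slope(p):
    """Derivative H'(p) of the binary entropy, in bits."""
    return log2((1 - p) / p)


def min_x_basis_count(e_bx, theta, exponent_bits=100):
    """Right-hand side of (A9): X-basis count needed for a 2^-100 exponent."""
    y = e_bx + theta
    gap = binary_entropy(y) - binary_entropy(e_bx) - binary_entropy_slope(y) * theta
    return exponent_bits / gap


def seed_length_bits(N, c1):
    """Exact seed length log2 C(N, c1) for an input with c1 zeros."""
    return log2(comb(N, c1))


# Hypothetical parameters: 2% X-basis error rate, theta = 0.03.
c1 = ceil(min_x_basis_count(0.02, 0.03))
for N in (10**6, 10**7, 10**8):
    exact = seed_length_bits(N, c1)
    bound = c1 * log2(N)
    # The last column is the seed length per run, which shrinks with N.
    print(N, round(exact), round(bound), exact / N)
```

Since $c_1$ does not grow with $N$, the seed length grows only logarithmically in $N$, and the seed length per run tends to zero.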